Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Refactor code to add is_view method for Series. #5853

Merged
2 commits merged into from
Jan 5, 2014

Conversation

jseabold
Copy link
Contributor

@jseabold jseabold commented Jan 5, 2014

So you can check if a series is a view instead of waiting to get an error on sort or copying by default.

@ghost
Copy link

ghost commented Jan 5, 2014

This looks ok to me but it needs a docstring.

@jseabold
Copy link
Contributor Author

jseabold commented Jan 5, 2014

Right. Done.

@ghost
Copy link

ghost commented Jan 5, 2014

This can go in 0.13.1, need to test for the exception being raised and add a release note, i'll merge.

@jseabold
Copy link
Contributor Author

jseabold commented Jan 5, 2014

The exception code was already there, so I assume it works. Where should I add the release note?

@ghost
Copy link

ghost commented Jan 5, 2014

doc/source/[v0.13.1.txt,whatsnew.rst].

ghost pushed a commit that referenced this pull request Jan 5, 2014
ENH: Refactor code to add is_view method for Series.
@ghost ghost merged commit 91d6f1b into pandas-dev:master Jan 5, 2014
@jtratner
Copy link
Contributor

jtratner commented Jan 5, 2014

I'm confused. Isn't this opposed to the logic of the SettingWithCopy errors?

Why isn't the resolution to make the BlockManager copy the array in place before doing the sort? (views share BlockManagers right?)

@jtratner
Copy link
Contributor

jtratner commented Jan 5, 2014

Or maybe I'm not clear why this is a public method vs method on the SingleBlockManager.

@ghost
Copy link

ghost commented Jan 5, 2014

I'm confused too. What does providing an is_view method on series have to do with any of that?

@jtratner
Copy link
Contributor

jtratner commented Jan 5, 2014

Okay, let me take a step back: what's the rationale for making this a public method?

@jreback
Copy link
Contributor

jreback commented Jan 5, 2014

iIIRC the view stuff needs to be taken out
is_copy needs to be set (if needed, eg the series was actually sorted)

sort/order/sort_index have an open issue to have their API fixed iirc

@ghost
Copy link

ghost commented Jan 5, 2014

Ok, I haven't kept up with those changes and considered this a harmless addition.
If I was too hasty, we can revert.

@jtratner
Copy link
Contributor

jtratner commented Jan 5, 2014

Can we revert it temporarily and revisit after we push 0.13.1?

@ghost
Copy link

ghost commented Jan 5, 2014

Sure, but before that, can you explain the issue in a way I can grok?
I feel like I stepped in something I don't understand, if you'll excuse the
mixed metaphor.

@jtratner
Copy link
Contributor

jtratner commented Jan 5, 2014

Issue that arose often was that people used chained assignment and accidentally set values on copies (accomplishing nothing). @jreback added a SettingWithCopy warning that checks for setting values on non-explicit copies (ie not from copy() or methods that return a new object).

So now the general instruction is that you want to work with views not (accidental) copies. This method suggests the opposite mindset.

It also exposes the user to a numpy implementation detail that (I'm speculating) doesn't make sense for pandas.

@jreback
Copy link
Contributor

jreback commented Jan 5, 2014

you almost always get a new pandas object from operations (except when they r inplace)
prob is various chained assignments can't know this because of the way python evaluates statelessly

so an assignment could be working on a non explicit copy

setting is_copy on cross sectional ops like take (which sort effectively does)
makes this clear to the following operations u do with the object

normally will not matter/trigger except when a copy is for sure created (eg u know it's not a view)
and the. u try to set it

(the false positive is you are actually NOT doing a chained assignment)

pandas differs from numpy here because we can make a guarantee that an op returns a copy
just trying to make it explicit where it can be so that subsequent operations can look to that guarantee and provide a warning (if needed)

@ghost
Copy link

ghost commented Jan 5, 2014

I had a suspicion that part of the issue was an argument that this reifies the connection
to numpy arrays, that ship has long sailed IMO. In fact we usually present it as a feature.

The chained assignment issue and fix I'm aware of (as well as some recoil by users on the warnings,
@jseabold too in fact). how is_view suggests you use copies instead of a view I'm not sure, nor how
you can use a view instead of a copy ("the general instruction") if what you want to do is mutate the array.

I'm probably just missing 600 comments or so in other issues, too late for that.
I'll revert and open an issue for 0.14 referencing this. seems I no longer have
the overall view (bad pun) needed to make an informed decision, I'll steer clear
in the future.

ghost pushed a commit that referenced this pull request Jan 5, 2014
Following discussion in GH5853, reverting the change to revisit
in 0.14.

This reverts commit 91d6f1b, reversing
changes made to ee92c75.
@ghost ghost mentioned this pull request Jan 5, 2014
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants